AITopics | adaptive 0

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Strategies to Minimize Out-of-Distribution Effects in Data-Driven MRS Quantification

Merkofer, Julian P., Kaiser, Antonia, Schrantee, Anouk, Gurney-Champion, Oliver J., van Sloun, Ruud J. G.

arXiv.org Machine Learning, Dec-1-2025

This study systematically compared data-driven and model-based strategies for metabolite quantification in magnetic resonance spectroscopy (MRS), focusing on resilience to out-of-distribution (OoD) effects and the balance between accuracy, robustness, and generalizability. A neural network designed for MRS quantification was trained using three distinct strategies: supervised regression, self-supervised learning, and test-time adaptation. These were compared against model-based fitting tools. Experiments combined large-scale simulated data, designed to probe metabolite concentration extrapolation and signal variability, with 1H single-voxel 7T in-vivo human brain spectra. In simulations, supervised learning achieved high accuracy for spectra similar to those in the training distribution, but showed marked degradation when extrapolated beyond the training distribution. Test-time adaptation proved more resilient to OoD effects, while self-supervised learning achieved intermediate performance. In-vivo experiments showed larger variance across the methods (data-driven and model-based) due to domain shift. Across all strategies, overlapping metabolites and baseline variability remained persistent challenges. While strong performance can be achieved by data-driven methods for MRS metabolite quantification, their reliability is contingent on careful consideration of the training distribution and potential OoD effects. When such conditions in the target distribution cannot be anticipated, test-time adaptation strategies ensure consistency between the quantification, the data, and the model, enabling reliable data-driven MRS pipelines.

adaptive, concentration, spectra, (17 more...)

arXiv.org Machine Learning

2511.23135

Country:

Europe > Netherlands > North Brabant > Eindhoven (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
Europe > Switzerland > Vaud > Lausanne (0.04)
(4 more...)

Genre:

Research Report > New Finding (0.48)
Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (0.92)
Health & Medicine > Health Care Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Evaluating Sparse Autoencoders for Monosemantic Representation

Fereidouni, Moghis, Haider, Muhammad Umair, Ju, Peizhong, Siddique, A. B.

arXiv.org Artificial Intelligence, Oct-20-2025

A key barrier to interpreting large language models is polysemanticity, where neurons activate for multiple unrelated concepts. Sparse autoencoders (SAEs) have been proposed to mitigate this issue by transforming dense activations into sparse, more interpretable features. While prior work suggests that SAEs promote monosemanticity, no quantitative comparison has examined how concept activation distributions differ between SAEs and their base models. This paper provides the first systematic evaluation of SAEs against base models through the lens of activation distributions. We introduce a fine-grained concept separability score based on the Jensen-Shannon distance, which captures how distinctly a neuron's activation distributions vary across concepts. Using two large language models (Gemma-2-2B and DeepSeek-R1) and multiple SAE variants across five datasets (including word-level and sentence-level), we show that SAEs reduce polysemanticity and achieve higher concept separability. To assess practical utility, we evaluate concept-level interventions using two strategies: full neuron masking and partial suppression. We find that, compared to base models, SAEs enable more precise concept-level control when using partial suppression. Building on this, we propose Attenuation via Posterior Probabilities (APP), a new intervention method that uses concept-conditioned activation distributions for targeted suppression. APP achieves the smallest perplexity increase while remaining highly effective at concept removal.
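
As a rough illustration of the separability idea in this abstract, the sketch below computes the Jensen-Shannon distance between per-concept activation histograms for a single neuron and averages it over all concept pairs. The helper names (`js_distance`, `histogram`, `mean_pairwise_js`), the equal-width binning, and the pairwise averaging are assumptions made for illustration; the abstract does not say how the paper's score is binned or aggregated.

```python
import math


def js_distance(p, q):
    """Jensen-Shannon distance (base-2 logs) between two non-negative histograms."""
    sp, sq = sum(p), sum(q)
    p = [x / sp for x in p]
    q = [x / sq for x in q]
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        # Terms with a_i == 0 contribute nothing; b_i > 0 wherever a_i > 0.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    return math.sqrt(max(0.0, 0.5 * kl(p, m) + 0.5 * kl(q, m)))


def histogram(values, lo, hi, bins):
    """Count values into `bins` equal-width bins on [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins or 1.0  # guard against a zero-width range
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


def mean_pairwise_js(acts_by_concept, bins=20):
    """Average JS distance over all concept pairs (needs at least two concepts).

    `acts_by_concept` maps a concept name to a list of one neuron's activations.
    The aggregation by simple averaging is an illustrative assumption.
    """
    all_vals = [v for acts in acts_by_concept.values() for v in acts]
    lo, hi = min(all_vals), max(all_vals)
    hists = [histogram(a, lo, hi, bins) for a in acts_by_concept.values()]
    pairs = [(i, j) for i in range(len(hists)) for j in range(i + 1, len(hists))]
    return sum(js_distance(hists[i], hists[j]) for i, j in pairs) / len(pairs)
```

With base-2 logarithms the distance lies in [0, 1]: it is 0 for identical histograms and 1 for histograms that share no bins, so a neuron that responds to concepts in clearly different activation ranges scores close to 1.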

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2508.15094

Country:

Europe (0.46)
North America > United States (0.28)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Mitigating Disparate Impact of Differentially Private Learning through Bounded Adaptive Clipping

Zhao, Linzh, Rehn, Aki, Heikkilä, Mikko A., Tajeddine, Razane, Honkela, Antti

arXiv.org Machine Learning, Jun-3-2025

Differential privacy (DP) has become an essential framework for privacy-preserving machine learning. Existing DP learning methods, however, often have disparate impacts on model predictions, e.g., for minority groups. Gradient clipping, which is often used in DP learning, can suppress larger gradients from challenging samples. We show that this problem is amplified by adaptive clipping, which will often shrink the clipping bound to tiny values to match a well-fitting majority, while significantly reducing the accuracy for others. We propose bounded adaptive clipping, which introduces a tunable lower bound to prevent excessive gradient suppression. Our method improves the accuracy of the worst-performing class by over 10 percentage points on average on skewed MNIST and Fashion MNIST compared to unbounded adaptive clipping, and by over 5 percentage points compared to constant clipping.
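
The shrinking behaviour described in this abstract can be sketched in a few lines. The code below assumes a quantile-tracking update for the adaptive bound (a geometric step toward a target fraction of unclipped gradients) and enforces the tunable lower bound with a simple `max`. The function names, the update rule, and all default values are illustrative assumptions, not the paper's implementation. A real DP pipeline would also privatize the unclipped fraction, which this sketch omits.

```python
import math
import random


def l2_norm(vec):
    return math.sqrt(sum(x * x for x in vec))


def clip(vec, bound):
    """Rescale a per-sample gradient so its L2 norm is at most `bound`."""
    norm = l2_norm(vec)
    scale = min(1.0, bound / norm) if norm > 0 else 1.0
    return [x * scale for x in vec]


def update_bound(bound, grads, target_quantile=0.5, lr=0.2, lower=0.0):
    """One adaptive step (assumed rule), then apply the lower bound.

    The bound shrinks when more than `target_quantile` of the gradients are
    already below it and grows otherwise. `lower=0.0` means unbounded.
    NOTE: the unclipped fraction is used without noise here (not private).
    """
    frac_unclipped = sum(l2_norm(g) <= bound for g in grads) / len(grads)
    new_bound = bound * math.exp(-lr * (frac_unclipped - target_quantile))
    return max(new_bound, lower)


def dp_mean_gradient(grads, bound, noise_multiplier, rng):
    """Clip each gradient, sum, add Gaussian noise scaled to the bound, average."""
    total = [0.0] * len(grads[0])
    for g in grads:
        for i, x in enumerate(clip(g, bound)):
            total[i] += x
    sigma = noise_multiplier * bound
    return [(t + rng.gauss(0.0, sigma)) / len(grads) for t in total]
```

For example, with 90 gradients of norm 0.01 (a well-fitting majority) and 10 of norm 5, repeatedly calling `update_bound` from a bound of 1.0 drives the unbounded variant down to roughly 0.01, so the 10 large gradients are scaled down by a factor of about 500. With `lower=0.1` the bound stops at 0.1 instead.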

accuracy, data mining, machine learning, (16 more...)

arXiv.org Machine Learning

2506.01396

Country:

Europe > Austria > Vienna (0.14)
Europe > Netherlands (0.14)
Europe > Finland > Uusimaa > Helsinki (0.04)
(15 more...)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (0.68)
Government > Regional Government (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

Balancing Diversity and Risk in LLM Sampling: How to Select Your Method and Parameter for Open-Ended Text Generation

Zhou, Yuxuan, Keuper, Margret, Fritz, Mario

arXiv.org Artificial Intelligence, Aug-24-2024

Sampling-based decoding strategies have been widely adopted for Large Language Models (LLMs) in numerous applications, which target a balance between diversity and quality via temperature tuning and tail truncation (e.g., top-k and top-p sampling). Considering the high dynamic range of candidate next tokens given different prefixes, recent studies propose to adaptively truncate the tail of the LLM's predicted distribution. Although improved results have been reported with these methods on open-ended text generation tasks, the results are highly dependent on the curated truncation parameters and exemplar text. In this paper, we propose a systematic way to estimate the intrinsic capacity of a truncation sampling method by considering the trade-off between diversity and risk at each decoding step, based on our collected prefix tree which preserves the context of a full sentence. Our work provides a comprehensive comparison between existing truncation sampling methods, as well as their recommended parameters as a guideline for users.
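
For readers unfamiliar with the baseline methods named in this abstract, here is a minimal sketch of temperature scaling with top-k and top-p (nucleus) truncation over a single next-token distribution. It shows the standard textbook form of these two filters only, not the adaptive methods or the evaluation procedure from the paper, and the function names are hypothetical.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def renormalize(probs, keep):
    """Zero out every token not in `keep` and rescale the rest to sum to 1."""
    kept = set(keep)
    total = sum(probs[i] for i in kept)
    return [probs[i] / total if i in kept else 0.0 for i in range(len(probs))]


def top_k_filter(probs, k):
    """Keep only the k most probable tokens."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return renormalize(probs, order[:k])


def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative mass reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, mass = [], 0.0
    for i in order:
        keep.append(i)
        mass += probs[i]
        if mass >= p:
            break
    return renormalize(probs, keep)


def sample(probs, rng):
    """Draw one token index from the (possibly truncated) distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

Top-k keeps a fixed number of candidates regardless of how peaked the distribution is, while top-p keeps a variable number that depends on where the probability mass sits. That difference is why the choice of truncation parameter matters across prefixes.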

probability, risk level, truncation, (16 more...)

arXiv.org Artificial Intelligence

2408.13586

Country:

North America > United States (0.04)
Europe > Germany > Saarland (0.04)
Asia > India (0.04)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

On the (Un-)Avoidability of Adversarial Examples

Chowdhury, Sadia, Urner, Ruth

arXiv.org Machine Learning, Jun-24-2021

The phenomenon of adversarial examples in deep learning models has caused substantial concern over their reliability. While many deep neural networks have shown impressive performance in terms of predictive accuracy, it has been shown that in many instances an imperceptible perturbation can falsely flip the network's prediction. Most research has then focused on developing defenses against adversarial attacks or learning under a worst-case adversarial loss. In this work, we take a step back and aim to provide a framework for determining whether a model's label change under small perturbation is justified (and when it is not). We carefully argue that adversarial robustness should be defined as a locally adaptive measure complying with the underlying distribution. We then suggest a definition for an adaptive robust loss, derive an empirical version of it, and develop a resulting data-augmentation framework. We prove that our adaptive data-augmentation maintains consistency of 1-nearest neighbor classification under deterministic labels and provide illustrative empirical evaluations.

predictor, robust loss, robustness, (16 more...)

arXiv.org Machine Learning

2106.13326

Country:

North America > Canada > Ontario > Toronto (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.64)

Industry: Information Technology > Security & Privacy (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Split Modeling for High-Dimensional Logistic Regression

Christidis, Anthony-Alexander, Van Aelst, Stefan, Zamar, Ruben

arXiv.org Machine Learning, Feb-17-2021

A novel method is proposed to learn an ensemble of logistic classification models in the context of high-dimensional binary classification. The models in the ensemble are built simultaneously by optimizing a multi-convex objective function. To enforce diversity between the models, the objective function penalizes overlap between the models in the ensemble. We study the bias and variance of the individual models as well as their correlation, and discuss how our method learns the ensemble by exploiting the accuracy-diversity trade-off for ensemble models. In contrast to other ensembling approaches, the resulting ensemble model is fully interpretable as a logistic regression model and at the same time yields excellent prediction accuracy, as demonstrated in an extensive simulation study and gene expression data applications. An open-source compiled software library implementing the proposed method is briefly discussed.
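
To make the notion of an overlap-penalized ensemble objective concrete, the sketch below evaluates one plausible form: the sum of each model's logistic loss plus a penalty on products of absolute coefficients that two models place on the same feature. The abstract does not state the penalty, so this specific form, the function names, and the `diversity` weight are assumptions for illustration only.

```python
import math


def logistic_loss(beta, intercept, X, y):
    """Mean negative log-likelihood of a logistic model for labels in {0, 1}."""
    total = 0.0
    for row, label in zip(X, y):
        z = intercept + sum(b * x for b, x in zip(beta, row))
        # log(1 + exp(z)) computed in a numerically stable way
        softplus = max(z, 0.0) + math.log1p(math.exp(-abs(z)))
        total += softplus - label * z
    return total / len(y)


def overlap_penalty(betas):
    """Sum over model pairs of |beta_g[j]| * |beta_h[j]| (assumed penalty form).

    The penalty is zero when no feature has a nonzero coefficient in two models.
    """
    total = 0.0
    for g in range(len(betas)):
        for h in range(g + 1, len(betas)):
            total += sum(abs(a) * abs(b) for a, b in zip(betas[g], betas[h]))
    return total


def ensemble_objective(betas, intercepts, X, y, diversity=1.0):
    """Fit term for every model plus the weighted overlap penalty."""
    fit = sum(logistic_loss(b, c, X, y) for b, c in zip(betas, intercepts))
    return fit + diversity * overlap_penalty(betas)
```

Under this assumed form, fixing all models but one turns the penalty into a weighted L1 term in the remaining model's coefficients, so each block subproblem is convex. That matches the abstract's description of the objective as multi-convex.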

adaptive 0, split-en 0, split-lasso 0, (13 more...)

arXiv.org Machine Learning

2102.08591

Country:

North America > United States > New York (0.04)
North America > Canada > British Columbia (0.04)
Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)

Genre:

Research Report > Experimental Study (0.48)
Research Report > New Finding (0.34)

Industry:

Health & Medicine > Therapeutic Area (0.67)
Health & Medicine > Pharmaceuticals & Biotechnology (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Online Multi-Label Classification: A Label Compression Method

Ahmadi, Zahra, Kramer, Stefan

arXiv.org Machine Learning, Apr-4-2018

Many modern applications deal with multi-label data, such as functional categorizations of genes, image labeling and text categorization. Classification of such data with a large number of labels and latent dependencies among them is a challenging task, and it becomes even more challenging when the data is received online and in chunks. Many of the current multi-label classification methods require a lot of time and memory, which makes them infeasible for practical real-world applications. In this paper, we propose a fast linear label space dimension reduction method that transforms the labels into a reduced encoded space and trains models on the obtained pseudo labels. Additionally, it provides an analytical method to update the decoding matrix which maps the labels into the original space and is used during the test phase. Experimental results show the effectiveness of this approach in terms of running times and the prediction performance over different measures.

Keywords: data stream classification, multi-label data, label compression

1. Introduction

Standard classification is the task of assigning the correct class to previously unknown test instances based on training instances. Training data consist of a set of features and an associated target class or class label. Many modern data mining applications, however, need to deal with more than one label per instance.

classification, dataset, proceedings, (17 more...)

arXiv.org Machine Learning

1804.01491

Country:

Europe > Germany > Rheinland-Pfalz > Mainz (0.04)
Oceania > New Zealand > North Island > Waikato (0.04)
Asia > Middle East > Lebanon (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
